METHODS FOR ASSESSING BIOLOGIC DIVERSITY

TECHNICAL FIELD 

This invention relates to methods of assessing biologic diversity, and more particularly to 
using random nucleic acid molecules for assessing biologic diversity. 

BACKGROUND 

The ability to mount an immune defense against infectious microorganisms and their
products, tumors, and other environmental challenges is believed to be a direct function of
lymphocyte diversity. See Silins et al., Blood 98:3739-44 (2001); and Clemente et al., Lab.
Invest. 78:619-27 (1998). While the total number of lymphocytes in the blood can be measured
with precision, the diversity of the T cell compartment, on which immunocompetence is based,
cannot.

In the absence of direct measures of lymphocyte diversity, various indirect methods for
estimating diversity have been used. For example, antibodies against variable (V)-region
families have been used to characterize lymphocyte populations by flow cytometric analysis.
Sheehan et al., EMBO J. 8:2313-20 (1989); Langerak et al., Blood 98:165-73 (2001). This
approach detects "constant" antigenic determinants shared by many lymphocyte receptor clones,
and diversity is inferred from the result. As another example, nucleic acids encoding lymphocyte
receptors can be amplified by polymerase chain reaction (PCR) using constant region (C) and V
family specific primers. Murata et al., Arthritis Rheum. 46:2141-7 (2002). Like FACS analysis,
this approach does not differentiate between individual clones of the same family and may fail to
detect balanced narrowing (or expansion) of the repertoire.

Diversity can also be estimated by spectratyping or immunoscope. See Pannetier et al.,
Proc. Natl. Acad. Sci. USA 90:4319-23 (1993); Pannetier et al., Immunol. Today 16:176-81
(1995); and Delassus et al., J. Immunol. Methods 184:219-29 (1995). After V specific families
are amplified by PCR, a fluorescently labeled junctional region (J) primer is used for a "run off"
PCR reaction, the products of which can be separated on sequencing gels. Amplified
lymphocyte receptor families (specified by the primers used in the initial PCR) migrate in a
series of bands, each of which corresponds to a different length of the complementarity
determining region 3 (CDR3, the T cell receptor (TCR) region believed to harbor the largest
portion of genetic variability). In normal lymphocyte populations the CDR3 size distribution is
Gaussian for each variable region family, and so any alteration in distribution and/or band
intensity is attributed to a perturbation of diversity. Unfortunately, V and J combinations cannot
be analyzed routinely because over 4,684 V-J family combinations for human T cell receptors
exist. Hence, only a small fraction of V-J combinations are analyzed, the choice of which is
random and therefore may or may not represent the entire receptor population. In addition,
spectratyping does not detect individual clones that may share the same V-J combination and the
same CDR3 length.

Still another method of measuring lymphocyte diversity is based on the tenets of limiting
dilution analysis and detects the frequency of a given TCR clone. Wagner et al., Proc. Natl. Acad.
Sci. USA 95:14447-52 (1998). This method is laborious and is based on the assumption that the
frequency of the selected clone represents the frequency of all clones. Thus, a method that can
directly and rapidly assess lymphocyte receptor diversity is needed.

SUMMARY 

The invention is based on methods for estimating biologic diversity using random nucleic
acid molecules. For example, methods described herein can be used to identify and/or quantify
heterogeneous populations of viruses that are contained within an individual (i.e., viral
quasispecies). Methods described herein also can be used to probe directly the entire population
of lymphocyte receptors. The repertoire of lymphocyte receptor genes is established by
rearrangement of germline DNA, resulting in >1000-fold more diversity than the entire genome,
and varies between genetically identical individuals. Methods of the invention include
hybridizing labeled nucleic acid molecules from the biologic population to be assessed (e.g.,
viruses or lymphocyte receptors) with a population of random nucleic acid molecules. Diversity
is assessed based on the hybridization of the two populations of nucleic acid molecules. As
described herein, the frequency of hybridization of the labeled nucleic acids to the random
nucleic acid molecules varies in direct proportion to diversity. Methods of the invention can be
used clinically to diagnose immunodeficiency stemming from compression of lymphocyte
repertoires or to monitor immune reconstitution following hematopoietic cell transplantation. In
addition, viral quasispecies can be identified and quantified to guide therapeutic choices and
make prognostic assessments.






Docket No. 07039-463P0I/ MMV-02-127 

In one aspect, the invention features a method for determining lymphocyte diversity in a
subject. The method includes obtaining labeled nucleic acid molecules (e.g., RNA or DNA)
from a population of the subject's lymphocytes (e.g., T or B lymphocytes), wherein each labeled
nucleic acid molecule encodes a lymphocyte receptor or a portion thereof; hybridizing the
labeled nucleic acid molecules or fragments of the labeled nucleic acid molecules with a
population of random nucleic acid molecules; and determining lymphocyte diversity of the
subject by assessing hybridization of the labeled nucleic acid molecules with the population of
random nucleic acid molecules. The random nucleic acid molecules within the population can
be attached to a solid substrate. For example, the random nucleic acid molecules can be attached
to a bead and hybridization can be assessed by flow cytometry. The solid substrate can include a
plurality of discrete regions, wherein each of the discrete regions includes a different random
nucleic acid molecule. The labeled nucleic acid molecules can be labeled with a fluorochrome
(e.g., fluorescein isothiocyanate (FITC), phycoerythrin (PE), allophycocyanin (APC), or
peridinin chlorophyll protein (PerCP)), biotin, or an enzyme. Each labeled nucleic acid
molecule can encode a variable region from a T cell receptor (e.g., a complementarity
determining region (CDR) 3 β chain polypeptide) or a variable portion from a heavy chain or a
light chain.

The invention also features a method for monitoring a disease in a subject. The method 
includes obtaining labeled nucleic acid molecules from a population of the subject's 
lymphocytes, wherein each labeled nucleic acid molecule encodes a lymphocyte receptor or a 
portion thereof; hybridizing the labeled nucleic acid molecules or fragments of the labeled 
nucleic acid molecules with a population of random nucleic acid molecules; determining 
lymphocyte diversity of the subject by assessing hybridization of the labeled nucleic acid 
molecules with the population of random nucleic acid molecules; and comparing the subject's 
lymphocyte diversity with lymphocyte diversity of a control population, wherein an alteration in 
the subject's lymphocyte diversity relative to that of the control population indicates a change in
the disease. An increase in the subject's lymphocyte diversity can indicate a positive change in
the disease. A decrease in the subject's lymphocyte diversity can indicate a negative change in
the disease. The disease can be an autoimmune disorder (e.g., rheumatoid arthritis or multiple
sclerosis), colitis, or a lymphoid disease (e.g., leukemia or lymphoma).

In another aspect, the invention features a method for determining viral diversity in a
subject. The method includes obtaining labeled nucleic acid molecules from a biological sample
of the subject, wherein the labeled nucleic acid molecules encode a viral polypeptide;
hybridizing the labeled nucleic acid molecules or fragments of the labeled nucleic acid molecules
with a population of random nucleic acid molecules; and determining viral diversity of the
subject by assessing hybridization of the labeled nucleic acid molecules with the population of
random nucleic acid molecules.

The invention also features an article of manufacture that includes a solid substrate,
wherein the solid substrate includes random nucleic acid molecules immobilized thereto; and a
primer for producing nucleic acid molecules encoding a lymphocyte receptor or a portion thereof
or a primer for producing nucleic acid molecules encoding a viral polypeptide. The solid
substrate can be a bead or a chip.

Unless otherwise defined, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although methods and materials similar or equivalent to those described herein can be
used in the practice or testing of the present invention, suitable methods and materials are
described below. In addition, the materials, methods, and examples are illustrative only and not
intended to be limiting. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In case of conflict, the present
specification, including definitions, will control.

The details of one or more embodiments of the invention are set forth in the 
accompanying drawings and the description below. Other features, objects, and advantages of 
the invention will be apparent from the drawings and detailed description, and from the claims. 

DESCRIPTION OF DRAWINGS 
FIG 1A and 1B are graphs depicting the relationship between the number of hits
(defined as the number of gene chip sites undergoing hybridization) and the number of variants.
As indicated in FIG 1A, the number of hits increases with the number of variants, indicating that
the human gene chip can be used to detect random oligonucleotides. In FIG 1B, taking the natural
log of both axes yielded a linear relationship between hits and variants.


FIG 2 is a graph depicting the reproducibility of the method for analysis of receptor
diversity. Samples from FIG 1 were studied in three separate experiments to test reproducibility.
The slopes of the standard curves were the same statistically; the y intercept varied from
experiment to experiment.

FIG 3 is a graph depicting the relationship between the number of hits and the number of
variants for B cells using the gene chip method. Splenocytes were harvested from 3-4 week old
JH-/-, MBT, QM and WT mice and mononuclear cells were isolated on Ficoll-Paque gradients.
Total RNA was isolated from the lymphocytes and first strand cDNA was generated using a
primer designed to bind the constant region of the mouse heavy chain J region plus the T7
polymerase promoter. The custom primer promoted amplification of heavy chain-specific RNA
only. Equal amounts of the in vitro transcription product (cRNA) from each mouse and
standards (-•-) were hybridized to gene chips and then the chips were stained and analyzed as
described in Figure 1. WT diversity (-▲-) was more than two-fold higher than QM (-■-)
diversity. MBT (-•-) diversity was less than 1 logarithmic unit. Background hybridization was
established using JH-/- RNA (-○-).

FIG 4 is a graph depicting the analysis of human T cell diversity using gene chips. Two
normal individuals ((-•-), (-○-)), two thymectomized individuals ((-■-), (-□-)) and one individual
with inflammatory bowel disease (IBD) (-▲-) were analyzed. Background hybridization was
established using Jurkat cell hybridization.

DETAILED DESCRIPTION

In general, the invention provides a method for the direct measurement of biologic
diversity (e.g., lymphocyte diversity or viral quasispecies). As used herein, "viral quasispecies"
refers to different, but closely related, viral variants within an individual. The method typically
employs nucleic acid molecules from the biologic population to be assessed, wherein the nucleic
acid molecules encode polypeptides containing one or more variable portions. As used herein,
the term "polypeptide" refers to a chain of amino acids, regardless of length or post-translational
modification. For example, to assess lymphocyte receptor diversity, each nucleic acid molecule
can encode a T cell receptor (TCR) or a B cell receptor (BCR), or a variable portion thereof. To
assess viral quasispecies, the nucleic acid can encode a viral polypeptide (e.g., a full-length
surface polypeptide or a variable portion thereof). Such nucleic acid molecules are labeled, then
hybridized with a population of random nucleic acid molecules. Diversity is assessed by the
hybridization of the two populations of nucleic acid molecules.

Methods of the invention can be used to estimate diversity of the entire viral quasispecies
or lymphocyte repertoire (i.e., all gene segment combinations) at once and are equally capable of
measuring B cell and T cell diversity. Furthermore, the methods can be used to estimate
diversity of a particular population of cells or immunoglobulins (e.g., IgG or IgM molecules). 
The method is sufficiently simple and effective to allow widespread application, including the 
monitoring of immunological disease in subjects, monitoring immune reconstitution following 
hematopoietic cell transplantation, determining suitable therapies for treatment of a viral 
infection, and determining prognosis. 

Nucleic Acid Molecules from Lymphocytes 

To assess lymphocyte receptor diversity, nucleic acid molecules encoding polypeptides
containing one or more variable portions are used. Such nucleic acid molecules each can encode
an α, β, γ, or δ chain of a TCR, or one or more variable portions from the α, β, γ, or δ chain of a
TCR. A variable portion from an α chain of a TCR can be encoded, for example,
by one or more of the Vαn or Jαn variable gene segments. As used herein, "gene segment" refers
to a nucleic acid molecule encoding a variable (V), diversity (D), junctional (J), or other region
of a TCR or BCR. Gene segments typically are separated from one another in the genome by
large stretches of DNA that are not transcribed. Gene segments are composed of coding
sequences (exons) that are separated by introns. In some embodiments, the variable portion of
the α chain is a hypervariable region (e.g., a complementarity determining region (CDR)) or a
portion of a hypervariable region. Thus, in some embodiments, a nucleic acid molecule can
encode a CDR, e.g., CDR1, 2, and/or 3, of an α chain or a portion of a CDR. CDR1 and CDR2
of the α chain are encoded by V gene segments; CDR3 is encoded by V and J gene segments.

A variable portion from a β chain of a TCR can be encoded, for example, by one or more
of the Vβn or Jβn gene segments. In some embodiments, the variable portion of a β chain is a
hypervariable region (e.g., a CDR) or a portion thereof. Thus, in some embodiments, a nucleic
acid molecule can encode a CDR, e.g., CDR1, 2, and/or 3, of a β chain, or a portion of a CDR. In
the β chain, CDR1 and CDR2 are encoded by V gene segments; CDR3 is encoded by V, D, and J
gene segments.

Suitable nucleic acid molecules also can encode a BCR or one or more variable portions
of a BCR (e.g., a variable portion of a heavy or light chain of an immunoglobulin, including IgG,
IgM, IgD, IgA, and IgE molecules). For example, a variable portion of a heavy chain can be
encoded by one of the VH, D, or JH gene segments. In some embodiments, the variable portion is
a hypervariable region (e.g., CDR1, 2, and/or 3) or a portion of a hypervariable region. In a light
chain (e.g., a κ or λ chain), the variable portion can be encoded by any one of the Vκn or Vλn
gene segments, or any one of the Jκ or Jλ gene segments. In some embodiments, the nucleic
acid encodes a hypervariable region (e.g., CDR1, 2, and/or 3) of a light chain or a portion of a
hypervariable region.

Populations of nucleic acid molecules encoding a TCR or BCR, or a variable portion of a
TCR or BCR, can be isolated from mononuclear cells. In general, a population of mononuclear
cells can be obtained from a biological sample, then nucleic acids can be extracted from the
mononuclear cells. For example, blood (e.g., peripheral blood) or a tissue sample (e.g., biopsy)
can be obtained from a subject (e.g., a human) and mononuclear cells isolated from such samples
using known techniques. For example, density gradient separation medium (e.g., Ficoll-Paque
(Amersham Biosciences, Piscataway, NJ)) can be used to isolate mononuclear cells.
Alternatively, negative or positive selection strategies can be used to obtain particular
populations of lymphocytes (e.g., T cells or B cells).

Nucleic acids can be obtained from the cells using known techniques. As used herein, 
"nucleic acid" refers to both RNA and DNA, including total RNA and genomic DNA. The 
nucleic acid can be double-stranded or single-stranded (i.e., a sense or an antisense single strand) 
and can be complementary to a nucleic acid encoding a polypeptide. Total RNA can be isolated 
from cells by lysing the cells with sodium dodecyl sulfate (SDS), treating the lysate with 
proteinase K, and isolating the RNA by extracting with a mixture of 25:24:1
phenol/chloroform/isoamyl alcohol and precipitating with sodium acetate and ethanol. In other
methods, cells can be lysed with guanidinium (e.g., 4 M guanidinium, pH 5.5) and the RNA
isolated by cesium chloride purification. Alternatively, total RNA can be extracted with kits such
as the Qiagen RNeasy™ kit (Qiagen, Inc., Valencia, CA) or the PureScript™ kit (Gentra
Systems, Inc., Minneapolis, MN).

Routine methods also can be used to extract genomic DNA from a blood or tissue
sample, including, for example, phenol extraction. Alternatively, genomic DNA can be
extracted with kits such as the QIAamp Tissue Kit (Qiagen, Chatsworth, CA), the Wizard®
Genomic DNA purification kit (Promega, Madison, WI), the Puregene DNA Isolation System
(Gentra Systems, Inc., Minneapolis, MN), and the A.S.A.P.™ Genomic DNA isolation kit
(Boehringer Mannheim, Indianapolis, IN).

To select nucleic acid molecules encoding a TCR or a BCR, or a variable portion of a
TCR or BCR, a primer that binds to a region of a TCR or a BCR can be used to produce an
extension product in the presence of a polymerase and the appropriate nucleotides. The primer
can be designed such that upon extension of the primer, the resulting product encodes a variable
portion of a TCR or a BCR. For example, in one embodiment, a primer that binds to a constant
region of a TCR or BCR gene is annealed to a sample of total RNA and a complementary DNA
(cDNA) is produced using reverse transcriptase and a mixture of deoxynucleotide triphosphates
(dNTPs). A second strand can be synthesized using a DNA polymerase, a mixture of dNTPs,
and the cDNA as a template. The double-stranded cDNA product can be purified (e.g., gel-
purified) and labeled as discussed below. A complementary RNA (cRNA) product can be
produced from the double-stranded cDNA product using RNA polymerase and the appropriate
ribonucleotides (NTPs). Alternatively, the second strand can be generated using a reverse primer
specific to the region of interest. Primers described above can be designed to include particular
promoter sequences to facilitate further manipulations (e.g., in vitro transcription).

In other embodiments, PCR is used to produce nucleic acids encoding a TCR or a BCR,
or a variable region of a TCR or BCR. Conventional PCR techniques are disclosed in U.S.
Patent Nos. 4,683,202, 4,683,195, 4,800,159, and 4,965,188. See also, for example, PCR
Primer: A Laboratory Manual, Dieffenbach and Dveksler (eds.), Cold Spring Harbor Laboratory
Press, 1995, for standard PCR conditions. PCR typically employs two oligonucleotide primers
that bind to a selected nucleic acid template (e.g., DNA or RNA, including messenger RNA).
Template nucleic acid need not be purified; it can be a minor fraction of a complex mixture, such
as microbial nucleic acid contained in mononuclear cells.

Nucleic acid molecules encoding a TCR or BCR, or a variable portion of a TCR or BCR,
are labeled, either directly or indirectly, with a detectable label. Typically, the label is
incorporated throughout the nucleic acid molecule. For example, the nucleic acids can be
labeled during in vitro synthesis of the nucleic acid molecules by incorporating modified dNTPs
or NTPs. Alternatively, techniques such as nick-translation or random priming can be used to
label a nucleic acid throughout its length.

Nucleic acid molecules can be labeled with an isotope such as ³²P or ³⁵S, a metallic label
(e.g., colloidal gold), or can be non-radioactively labeled with a fluorescent nucleotide derivative
such as ChromaTide™ (Molecular Probes, Inc., Eugene, OR). In addition, the nucleic acid
molecule can be labeled with a fluorophore such as 7-amino-4-methylcoumarin-3-acetic acid
(AMCA), Texas Red™ (Molecular Probes, Inc., Eugene, OR), 5-(and-6)-carboxy-X-rhodamine,
lissamine rhodamine B, 5-(and-6)-carboxyfluorescein, fluorescein-5-isothiocyanate (FITC),
7-diethylaminocoumarin-3-carboxylic acid, tetramethylrhodamine-5-(and-6)-isothiocyanate,
5-(and-6)-carboxytetramethylrhodamine, 7-hydroxycoumarin-3-carboxylic acid,
6-[fluorescein 5-(and-6)-carboxamido]hexanoic acid, N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a
diaza-3-indacenepropionic acid, eosin-5-isothiocyanate, erythrosin-5-isothiocyanate, phycoerythrin (B-,
R-, or cyanine-), allophycocyanin, Oregon Green™, or Cascade™ blue acetylazide (Molecular
Probes, Inc., Eugene, OR). Such molecules allow the hybridization of the two populations of
nucleic acid molecules to be visualized without secondary detection molecules.

Nucleic acid molecules also can be indirectly labeled with biotin or digoxigenin,
although secondary detection molecules or further processing then may be required to visualize
hybridization of the two populations of nucleic acid molecules. For example, a nucleic acid
molecule indirectly labeled with biotin can be detected using avidin or streptavidin conjugated
molecules (e.g., avidin or streptavidin conjugated antibodies). Digoxigenin labeled nucleic acids
can be detected using anti-digoxigenin antibodies. Typically, the antibodies are conjugated to a
reporter molecule such as an enzyme (e.g., alkaline phosphatase or horseradish peroxidase) or a
detectable label (e.g., a fluorophore). Enzymatic markers can be detected in standard
colorimetric reactions using a substrate and/or a catalyst for the enzyme. Catalysts for alkaline
phosphatase include 5-bromo-4-chloro-3-indolylphosphate and nitro blue tetrazolium.
Diaminobenzoate can be used as a catalyst for horseradish peroxidase.

Molecular beacons in conjunction with fluorescence resonance energy transfer (FRET)
also can be used as a label. Molecular beacon technology uses a nucleic acid molecule labeled
with a first fluorescent moiety and a second fluorescent moiety. The second fluorescent moiety
is generally a quencher, and the fluorescent labels are typically located at each end of the nucleic
acid. Molecular beacon technology uses a nucleic acid having sequences that permit secondary
structure formation (e.g., a hairpin). As a result of secondary structure formation within the
nucleic acid, both fluorescent moieties are in spatial proximity when the nucleic acid is in
solution. After hybridization to the random nucleic acids, the secondary structure of the nucleic
acid is disrupted and the fluorescent moieties become separated from one another such that after
excitation with light of a suitable wavelength, the emission of the first fluorescent moiety can be
detected.

In some embodiments, the two populations of nucleic acid molecules are not labeled. 
Hybridization can be assessed using a labeled protein that can distinguish single strands from 
double strands (e.g., MutS). 

Before hybridization to the random nucleic acid molecules, the labeled nucleic acid
molecules typically are fragmented into nucleic acid molecules ranging from 20 to 500
nucleotides in length. Fragments of 50 to 200 nucleotides are particularly useful. Fragmenting
may not be necessary if the nucleic acids were produced using a reverse primer specific to the
region of interest, as such nucleic acids will be in an appropriate size range.

Nucleic Acid Molecules From Viruses 

To assess viral quasispecies, nucleic acid molecules that encode viral polypeptides are
used. A nucleic acid molecule can encode a full-length viral polypeptide or a variable portion
thereof. For example, a nucleic acid molecule can encode hemagglutinin of influenza, Env of
HIV, gp120 of HIV, or E1 or E2 of hepatitis C, and variable portions of such polypeptides (e.g.,
hypervariable region 1 of E2 or a portion of hypervariable region 1). Nucleic acid molecules
encoding viral polypeptides can be isolated from biological samples and labeled using the
techniques described above. In some embodiments, nucleic acid molecules can be obtained from
multiple biological samples from the same subject at different time points (e.g., biological
samples before and after antibody seroconversion).

Random Nucleic Acid Molecules 

Random nucleic acid molecules typically are 10 to 50 nucleotides in length, and
preferably are 20 to 25 nucleotides in length. Random nucleic acids can be of unknown
sequence with any one of four nucleotides at each position. Alternatively, random nucleic acid
molecules can have known sequences from unrelated genes (i.e., non-TCR and non-BCR genes,
or non-viral genes).
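
As a rough illustration of a population of unknown sequence with any one of four nucleotides at each position, the sketch below draws random oligonucleotides in the 20 to 25 nucleotide range mentioned above. The helper names, the fixed seed, and the uniform choice of length and base are assumptions made for illustration only; they do not describe how a synthesizer or a commercial chip is actually prepared.

```python
import random

BASES = "ACGT"  # the four nucleotides allowed at each position

def random_oligo(rng, min_len=20, max_len=25):
    """Return one random oligonucleotide with a length in [min_len, max_len]."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(BASES) for _ in range(length))

def random_population(n, seed=0, min_len=20, max_len=25):
    """Return n random oligonucleotides (hypothetical helper; duplicates are possible)."""
    rng = random.Random(seed)
    return [random_oligo(rng, min_len, max_len) for _ in range(n)]

def sequence_space(length):
    """Number of distinct sequences of the given length: 4 ** length."""
    return len(BASES) ** length

population = random_population(5)
```

For a 20-mer, `sequence_space(20)` is 4^20, roughly 1.1 x 10^12 possible sequences, which gives a sense of how large the space sampled by a random population can be.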

Populations of random nucleic acid molecules can be produced synthetically. Methods
for synthesizing nucleic acid molecules are known in the art. For example, nucleic acid
molecules can be assembled by the cyanoethyl phosphoramidite method. See, for example,
"Oligonucleotide Synthesis: A Practical Approach," ed. M. J. Gait, IRL Press, 1984;
WO92/09615; and WO98/08857 for a description of synthesis methods. Automated synthesizer
machines can be used to produce random nucleic acid molecules. Such synthesizers are known
and are available from a variety of companies including Applied Biosystems and Amersham
Pharmacia Biotech.

The random nucleic acid molecules can be attached to a solid substrate. Suitable
substrates can be of any shape or form and can be constructed from, for example, glass, silicon,
metal, plastic, cellulose, or a composite. For example, a suitable substrate can include a
multiwell plate or membrane, a glass slide, a chip, or beads. Suitable beads can have an average
diameter of about 2 μm to 15 μm and can be polystyrene, ferromagnetic, or paramagnetic. For
example, the beads can have an average diameter of about 4 μm to about 11 μm. Typical
average bead diameters are about 4-5 μm, 7-8 μm, and 10-11 μm. Beads are available
commercially, for example, from Spherotech Inc., Libertyville, IL. Nucleic acid molecules can
be synthesized in situ, immobilized directly on the substrate, or immobilized via a linker,
including by covalent, ionic, or physical linkage. Linkers for immobilizing nucleic acids,
including reversible or cleavable linkers, are known in the art. See, for example, U.S. Patent No.
5,451,683 and WO98/20019. In some embodiments, the nucleic acid molecules are immobilized
in discrete locations on the solid substrate (e.g., a chip). See, for example, U.S. Patent Nos.
5,445,934 and 5,744,305. Such chips are commercially available, e.g., from Affymetrix (Santa
Clara, CA).

Hybridization

Hybridizations of the two populations of nucleic acid molecules typically are performed
under highly stringent conditions. For example, nucleic acid molecules can be hybridized to a
chip at 45°C in 2X MES (2-morpholinoethanesulfonic acid) for 12 to 16 hours then washed
twice in 0.5X MES at 65°C. 1X MES contains 1.0 M NaCl, 0.1 M MES pH 6.6, and 0.1% Triton
X-100. Hybridization conditions, including ionic strength of hybridization and wash solutions,
temperature of hybridization, length of hybridization, number of washes, and temperature of
washes, can be modified to account for unique features of the nucleic acid molecules, including
length and overall sequence composition, and the type of solid substrate (e.g., beads or chips).
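
The 2X and 0.5X solutions can be related to the stated 1X MES composition by simple scaling. The sketch below assumes that every component scales linearly with the stated strength; that assumption, the dictionary keys, and the function name are illustrative and are not taken from the text.

```python
# Composition of 1X MES as stated in the text.
ONE_X_MES = {"NaCl (M)": 1.0, "MES pH 6.6 (M)": 0.1, "Triton X-100 (%)": 0.1}

def scale_buffer(strength, one_x=ONE_X_MES):
    """Scale every component of the 1X recipe linearly (an assumption of this sketch)."""
    return {name: round(amount * strength, 6) for name, amount in one_x.items()}

hybridization_buffer = scale_buffer(2.0)  # 2X MES, used for hybridization at 45 degrees C
wash_buffer = scale_buffer(0.5)           # 0.5X MES, used for the washes at 65 degrees C
```

Under this assumption the hybridization solution would contain 2.0 M NaCl and the wash solution 0.5 M NaCl, which makes the difference in ionic strength between the two steps explicit.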

Methods and systems for imaging samples containing detectable labels are commercially
available. For example, a system that includes a scanner, flow cytometer, mass spectrometer,
confocal microscope, or real time thermocycler (e.g., with nucleic acid molecules labeled with
molecular beacons) can be used to detect hybridization intensity. For example, when the nucleic
acid molecules are attached to a bead and a fluorescent label is used, flow cytometry can be used
to determine the number and fluorescent intensity of the beads.

Typically, the number of hybridizations with intensity above background (i.e., number of
hits) can be summed. Background intensity is determined based on hybridization of a sample
with known diversity. In some embodiments, background can be intensity data from non-
lymphoid cells. A standard curve in which samples with known numbers of different nucleic
acid molecules are hybridized to a random population of nucleic acid molecules can be
generated. Diversity of a particular sample can be determined by extrapolating from the standard
curve.
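
The hit counting and standard curve steps can be sketched as follows. The sketch assumes the log-log linear relationship between hits and variants described for FIG 1B and fits it by ordinary least squares; the function names, the intensity values, and the standard-curve numbers are all made up for illustration and are not data from this document.

```python
import math

def count_hits(intensities, background):
    """Count sites whose hybridization intensity exceeds the background intensity."""
    return sum(1 for value in intensities if value > background)

def fit_loglog(variants, hits):
    """Least-squares fit of ln(hits) = slope * ln(variants) + intercept."""
    xs = [math.log(v) for v in variants]
    ys = [math.log(h) for h in hits]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_diversity(sample_hits, slope, intercept):
    """Invert the fitted standard curve to estimate the number of variants."""
    return math.exp((math.log(sample_hits) - intercept) / slope)

# Hypothetical standards: samples with known numbers of variants and their hit counts.
standard_variants = [1, 100, 10_000, 1_000_000]
standard_hits = [20, 200, 2_000, 20_000]

slope, intercept = fit_loglog(standard_variants, standard_hits)
sample_hits = count_hits([120, 80, 300, 95], background=100)
```

A call such as `estimate_diversity(2_000, slope, intercept)` then reads a diversity estimate off the curve. Because the y intercept is described as varying from experiment to experiment, a sketch like this would presumably be refit with standards run alongside each sample.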

In some embodiments, hybridization patterns are analyzed using, for example, the
"nearest shrunken centroid" method of analyzing microarrays. See Tibshirani et al., Proc. Natl.
Acad. Sci. USA 99(10):6567-72 (2002), for a description of nearest shrunken centroid analysis.
In general, subsets of binding sites that best characterize each particular population of nucleic
acid molecules (e.g., TCR) are identified. A standardized centroid is calculated for each class,
where the standardized centroid is defined as the average binding site intensity for each binding
site in each class divided by the within-sample standard deviation for that binding site. Nearest
centroid classification takes the binding intensity profile of a new sample and compares it to each
of these class centroids. The class whose centroid it is closest to, in squared distance, is the
predicted class for that new sample. Nearest shrunken centroid analysis implies the use of a
threshold to further shrink the number of binding site subsets for comparison. Determination of
the threshold is based on cross-validated misclassification error rate. Nearest shrunken centroid
analysis may be particularly useful for tracking individual T cell or B cell clones or clusters of T or
B cell clones. See also Slonim, Nat. Genet. 32 (Suppl.):502-8 (2002), for other data-analysis
techniques for microarray data.
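
A simplified sketch of nearest shrunken centroid classification is given below. It assumes equal class priors, a fixed threshold rather than one chosen by cross-validation, a median-like offset added to each pooled standard deviation, and binding sites with nonzero variance; the class labels "A" and "B" and all intensity values are hypothetical. It is meant to illustrate the idea, not to reproduce the published method in full.

```python
import math

def fit_shrunken_centroids(samples, labels, threshold):
    """Return (shrunken centroid per class, per-site scale used for standardizing)."""
    n = len(samples)
    n_sites = len(samples[0])
    classes = sorted(set(labels))
    overall = [sum(s[i] for s in samples) / n for i in range(n_sites)]
    members = {k: [s for s, lab in zip(samples, labels) if lab == k] for k in classes}
    centroids = {
        k: [sum(s[i] for s in rows) / len(rows) for i in range(n_sites)]
        for k, rows in members.items()
    }
    # Pooled within-class standard deviation for each binding site.
    pooled = []
    for i in range(n_sites):
        ss = sum((s[i] - centroids[lab][i]) ** 2 for s, lab in zip(samples, labels))
        pooled.append(math.sqrt(ss / (n - len(classes))))
    s0 = sorted(pooled)[n_sites // 2]  # offset that guards against tiny deviations
    scales = [p + s0 for p in pooled]
    shrunken = {}
    for k, rows in members.items():
        m_k = math.sqrt(1.0 / len(rows) + 1.0 / n)
        shrunken[k] = []
        for i in range(n_sites):
            # Standardized difference between the class centroid and the overall centroid.
            d = (centroids[k][i] - overall[i]) / (m_k * scales[i])
            # Soft thresholding: move d toward zero by the threshold amount.
            d_shrunk = math.copysign(max(abs(d) - threshold, 0.0), d)
            shrunken[k].append(overall[i] + m_k * scales[i] * d_shrunk)
    return shrunken, scales

def predict(sample, shrunken, scales):
    """Assign the class whose shrunken centroid is closest in standardized squared distance."""
    def distance(centroid):
        return sum(((x - c) / s) ** 2 for x, c, s in zip(sample, centroid, scales))
    return min(shrunken, key=lambda k: distance(shrunken[k]))

# Hypothetical intensities at four binding sites for two classes of samples.
samples = [[10, 1, 5, 5], [12, 2, 5, 6], [11, 1, 6, 5],
           [1, 10, 5, 5], [2, 12, 6, 5], [1, 11, 5, 6]]
labels = ["A", "A", "A", "B", "B", "B"]

model, scales = fit_shrunken_centroids(samples, labels, threshold=1.0)
predicted = predict([9, 2, 5, 5], model, scales)
```

In this toy data only the first two binding sites differ between the classes, so after shrinkage the last two sites sit at the overall centroid for both classes and no longer influence the prediction. That is the sense in which the threshold reduces the set of binding sites used for comparison.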

Monitoring Disease 

Methods of the invention can be used to determine a subject's lymphocyte 
diversity/repertoire, or to monitor or track diseases in subjects, or to identify particular 
therapeutic regimens. For example, a subject's lymphocyte diversity can be compared with 
lymphocyte diversity of a control population. In some embodiments, the control population can 
be the subject's baseline lymphocyte diversity (e.g., before a particular procedure or before 
treatment). An alteration in the subject's lymphocyte diversity relative to that of the control
population indicates a change in the disease. For example, the method can be used to monitor 
immune reconstitution following bone marrow transplantation or intensive antiretroviral therapy. In
these settings, a small number of clones might expand by homeostatic proliferation to yield 
normal lymphocyte numbers, but diversity might be altered. 

Loss of diversity has been implicated in various disease states. Thus, changes in diversity 
can be used to track the progression or remission of disease. Methods of the invention can be 
used to monitor diseases such as autoimmune disorders (e.g., rheumatoid arthritis, multiple
sclerosis, or insulin-dependent diabetes mellitus type I), colitis, or lymphoid diseases (e.g.,
leukemia or lymphoma).

The methods described herein also can be used to track expanded T cell clones or clusters 
of clones over time based on gene chip hybridization pattern. In particular, expanded T cell
clones seen in response to infection, transplantation or homeostatic proliferation may be tracked
over time with this method.

In addition, identifying and quantifying viral quasispecies can be used to guide 
therapeutic choices and make prognostic assessments for subjects. Persistence of some viral 
infections such as hepatitis and HIV is positively related to the diversity of the virus. For 
example, in a patient infected with hepatitis C, the quasispecies of the virus can determine 
whether the infection will resolve or become chronic. In subjects with resolving hepatitis, viral
diversity decreases after seroconversion, while in subjects with chronic disease, viral diversity
increases after seroconversion.

Articles of Manufacture

Populations of random nucleic acids, solid substrates, or primers can be combined with
packaging materials and sold as articles of manufacture or kits (e.g., for determining biologic
diversity). Components and methods for producing articles of manufacture are well known.

The articles of manufacture may combine one or more components described herein. For
example, an article of manufacture can include a solid substrate with random nucleic acid
molecules immobilized thereto and primers for producing specific nucleic acid molecules (e.g.,
nucleic acid molecules encoding a lymphocyte receptor or a portion thereof or a viral nucleic
acid molecule). In addition, the articles of manufacture further may include reagents to label
nucleic acid molecules, nucleic acids that can serve as positive or negative controls, and/or other
useful reagents for determining biologic diversity (e.g., reagents for preparing a standard curve).
Nucleic acids that serve as positive or negative controls can be immobilized on a solid substrate.
Instructions describing how a population of random nucleic acid molecules can be used to
determine biologic diversity also can be included.

The invention will be further described in the following examples, which do not limit the
scope of the invention described in the claims.

EXAMPLES
Example 1 - Methods and Materials
Isolation of RNA. Spleens harvested from mice were placed in RPMI and pushed
through a 70 µm cell strainer. Lymphocytes were isolated from the resulting suspension of
splenocytes or from peripheral blood using a Ficoll-Paque (Amersham Biosciences, Piscataway,
NJ) gradient. Total RNA was isolated from the lymphocytes using the Qiagen RNeasy kit
(Qiagen, Inc., Valencia, CA) per the manufacturer's instructions. Isolated RNA was resuspended
at a concentration of 2 µg/µl.
Generation of CDR3-specific cRNA. First strand cDNA was constructed as follows.
In an RNase-free microcentrifuge tube, 10 µl of total RNA (20 µg) was mixed with 1 µl (100
pmol/µl) of either:

T7+Cβ, which binds to the constant region of the TCR β chain,
5'-GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGGCTTGGGTGGAG
TCACATTTCTC-3' (SEQ ID NO: 1) for T cell receptor analysis, or

T7+CJh4
5'-GGCCAGTGAATTGTAATACGACTCA[...]
TGACTGAGGTTCCTTG-3' (SEQ ID NO:2) for B cell receptor analysis. The T7+CJh4 primer
binds to the Jh region of the receptor.

This mixture was incubated at 70 °C for 10 minutes followed by a quick spin and chill on
ice. To this reaction, 4 µl of 5X first strand cDNA buffer (Invitrogen, Inc., Carlsbad, CA), 2 µl
of 0.1 M DTT, and 1 µl of 10 mM dNTP mix were added and incubated at 37 °C for 2 minutes.
Next, 2 µl of SuperScript II Reverse Transcriptase (Invitrogen, Inc.) was added and the total
mixture was further incubated at 37 °C for 1 hour.

Following incubation, the first strand product was placed on ice. For second strand
synthesis, the following reagents were added to the first strand product: 91 µl of DEPC-treated
water, 30 µl of 5X Second Strand Reaction Buffer (Invitrogen, Inc.), 3 µl of 10 mM dNTPs, 1 µl
of 10 U/µl DNA ligase, 4 µl of 10 U/µl DNA polymerase I, and 1 µl of 2 U/µl RNase H. The
reaction was incubated at 16 °C for 2 hours in a cooling water bath. Following incubation, 2 µl
of 10 U T4 DNA polymerase was added and the entire mixture was incubated for an additional 5
minutes at 16 °C. Finally, 10 µl of 0.5 M EDTA was added to the mixture. The completed
double-stranded cDNA was purified using phase lock gel followed by phenol chloroform
extraction. The double-stranded cDNA product was then biotinylated with the Enzo BioArray
High Yield RNA Transcript Labeling Kit per the manufacturer's instructions. The in vitro
transcription product (cRNA) was purified using RNeasy spin columns (Qiagen, Inc.) per the
manufacturer's instructions. The purified product was quantified using spectrophotometric
analysis applying the convention that 1 OD at 260 nm equals 40 µg/ml of RNA. cRNA was
resuspended at a concentration of 1 µg/µl. cRNA was then fragmented to 50-200 bp sizes by
combining with 5 µl of 5X fragmentation buffer (Invitrogen, Inc.) in 15 µl of water. The mixture
was incubated at 94 °C for 35 minutes and put on ice following incubation. Equal amounts of
cRNA from different samples were hybridized to gene chips as described below.
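
The spectrophotometric convention above amounts to a simple unit conversion, sketched below. The helper names and the `dilution_factor` parameter are hypothetical and are included only to make the arithmetic explicit; the text does not state what dilution, if any, was used for the reading.

```python
RNA_UG_PER_ML_PER_OD260 = 40.0  # convention stated in the text: 1 OD at 260 nm = 40 ug/ml RNA

def rna_concentration_ug_per_ul(od260, dilution_factor=1.0):
    """Convert an OD260 reading of a (possibly diluted) sample to ug/ul of RNA."""
    ug_per_ml = od260 * RNA_UG_PER_ML_PER_OD260 * dilution_factor
    return ug_per_ml / 1000.0  # 1 ml = 1000 ul

def resuspension_volume_ul(total_rna_ug, target_ug_per_ul=1.0):
    """Volume needed to bring a known mass of RNA to the target concentration."""
    return total_rna_ug / target_ug_per_ul
```

For instance, a reading of 0.5 OD on a 50-fold dilution corresponds to 0.5 x 40 x 50 = 1000 µg/ml, i.e., 1 µg/µl.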

Application of cRNA to the gene chip. GeneChips® were purchased from Affymetrix,
Inc. (Santa Clara, CA) and prepared for hybridization of cRNA. While the ideal gene chip would
be synthesized with randomly generated oligonucleotides, it was reasoned that chips containing
known but unselected expressed sequence tags from human genes would share less homology
with mouse CDR-3 RNA and could be used instead. Accordingly, the human U95B chip was
used. As these chips were initially developed for an entirely different purpose, differences in
diversity of less than one order of magnitude may not be detected. Comparing such differences in
diversity can be accomplished with a larger panel of standards; however, there is little evidence
such differences matter biologically.

Data Analysis. For each gene chip experiment, raw data representing oligo location and
hybridization intensity were obtained. Data were arranged in order of ascending hybridization
intensity. The number of oligo locations with intensity above background (i.e., number of hits)
was summed. First, the standard curve was generated (from hybridization of samples with
known numbers of different oligos). Next, test samples were assessed and, based on the number
of hits, the diversity was extrapolated from the standard curve.
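
The hit-counting step can be sketched as follows. This is a minimal sketch that assumes the raw data have already been reduced to a list of intensities, one per oligo location, and that the background threshold is supplied by the user; the function name `count_hits` is hypothetical.

```python
from bisect import bisect_right

def count_hits(intensities, background):
    """Count oligo locations whose intensity is strictly above background."""
    ordered = sorted(intensities)                   # ascending order, as in the text
    first_hit = bisect_right(ordered, background)   # index of first value > background
    return len(ordered) - first_hit
```

Sorting is not needed to obtain the count, but it mirrors the ordering step described above and makes the position of the background cutoff easy to inspect.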

Example 2 - Standardization of Analysis of Lymphocyte
Receptor Diversity Using Gene Chips
As a first test of the concept, it was assessed whether the diversity of random
oligonucleotides could be predicted by the number of sites hybridized on a gene chip. To test
this question, an oligonucleotide 18 nucleotides in length (an 18 mer) was designed (similar in
sequence to an average T cell receptor CDR3 region) and then random point assignments were
inserted at specific locations. For example, to generate a sample with ~10^6 different
oligonucleotides, an 18 mer was synthesized with 10 sites of random base assignment (4^10 =
1,048,576). Samples were synthesized with 1, 10^3, 10^6, and 10^9 different oligonucleotides per
sample. The oligonucleotides were biotinylated and then 10 µg of each was hybridized to
separate gene chips under similar stringency conditions to those for conventional applications
(hybridization in 2X MES for 12 to 16 hours at 45 °C, washed twice in 0.5X MES at 65 °C).
Following hybridization, gene chips were stained with streptavidin phycoerythrin and then
scanned using GeneChip software yielding intensity at specific probe locations on the gene chip.
Hybridization intensity data were arranged in ascending order. The number of probe locations
with intensity above background (i.e., number of hits) was summed and compared to the number
of different oligos initially applied to the gene chip (i.e., number of variants). Background was
50. Scans of the gene chips afford rapid inspection of the "hit" profile. As predicted, the
number of hybridized sites increased with increasing numbers of different oligonucleotides (FIG
1A), indicating that a human gene chip could be used to detect random oligonucleotides. Due to
the exponential nature of the relationship, the natural logarithm of both variables was taken and
plotted (FIG 1B). The trend (i.e., slope of the standard curve) was highly reproducible; however,
overall hybridization intensity (i.e., y intercept of the standard curve) varied from experiment to
experiment (FIG 2). This variability requires use of a standard curve for each experiment
conducted.
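
The relationship between the number of randomized positions and the number of different oligonucleotides in a standard can be checked with a short calculation. The sketch below is illustrative only; the text gives the site count for the ~10^6 standard alone, so the helper `sites_for_target_diversity` (a hypothetical name) simply reports the smallest number of random sites that reaches a target, which is an assumption about how the other standards might be designed.

```python
def variants_for_random_sites(n_sites, alphabet=4):
    """Number of distinct sequences when n_sites positions each take any of 4 bases."""
    return alphabet ** n_sites

def sites_for_target_diversity(target, alphabet=4):
    """Smallest number of random sites giving at least `target` distinct sequences."""
    n = 0
    while alphabet ** n < target:
        n += 1
    return n
```

With 10 random sites, `variants_for_random_sites(10)` gives 4^10 = 1,048,576, matching the figure quoted in the text.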

Example 3 - Analysis of Murine B Cells 
To test whether the method of Example 2 could measure variations in lymphocyte
diversity, it was applied to the study of B cells in mice. Murine B cells were used for this
purpose because diversity can be measured, at least in principle, through analysis of
immunoglobulin (Ig) proteins and because of the availability of mutant mice with defined
variations in B cell antigen receptor repertoire. Diversity of B cell antigen receptors was
compared in wild type (C57BL/6) mice with that in quasi-monoclonal (QM) and monoclonal B-T
(MBT) mice. The QM mice were generated by gene-targeted replacement of the endogenous JH
elements with a VDJ rearranged region from a (4-hydroxy-3-nitrophenyl) acetyl (NP)-specific
hybridoma. Cascalho et al., Science 272:1649-1652 (1996); and Cascalho et al., J. Immunol.
159:5795-5801 (1997). The kappa light chains in these mice are non-functional and therefore the
knock-in heavy chain can only pair with endogenously rearranged lambda chains. All B cells in
QM mice start out with the same heavy chain; however, secondary rearrangement and
hypermutation change the specificity of 20% of B cell receptors in the periphery. Cascalho et al.,
J. Immunol., supra. Thus, 80% of the peripheral B cells in the QM mice express antibodies
containing the same single heavy chain, and the remaining 20% express antibodies containing
diverse heavy chains. The MBT mouse was produced by breeding recombinase-deficient mice
expressing the DO.11 TCR transgene with recombinase-deficient mice with a monoclonal B cell
compartment that produces antibodies specific for (4-hydroxy-3-nitrophenyl) acetyl. Cascalho et
al., Science, supra; Keshavarzi et al., Scand. J. Immunol., in press. JH-/- mice have a targeted
deletion of the JH and of the J kappa gene segments and therefore cannot assemble Ig heavy or
kappa light chains. Chen et al., Int. Immunol. 5:647-56 (1993). These animals are B cell
deficient although they do have precursor B cells (B220+/CD43+) that assemble lambda light
chain genes at a low level.

To test the method of measuring lymphocyte diversity, splenocytes were harvested from
3-4 week old JH-/-, MBT, QM and WT mice and mononuclear cells were isolated on Ficoll-
Paque gradients. Total RNA was isolated from the lymphocytes and first strand cDNA was
generated using a primer designed to bind the constant region of the mouse heavy chain J region
plus the T7 polymerase promoter. The custom primer promoted amplification of heavy chain-
specific RNA only. Equal amounts of the in vitro transcription product (cRNA) from each
mouse and standards were hybridized to gene chips and then the chips were stained and
analyzed as described above in Example 2. The hybridization intensities obtained from the JH
-/- mice (which lack B cell receptors) were used to set the background threshold, above which
hybridization sites were counted. As the results shown in FIG 3 indicate, wild type mice
expressed more than 10^ (2.8 x 10^) different B cell heavy chain clones, QM mice expressed 3.9
x 10^ different heavy chain clones and, as expected, MBT mice expressed approximately 1 heavy
chain clone.

Example 4 - Analysis of Human T Cells 

To determine if the gene chip method could measure T cell diversity of humans,
peripheral blood lymphocytes (PBL) were isolated from five donors (two normal individuals,
two thymectomized individuals, and one individual with IBD). Total RNA was isolated from the
human peripheral blood lymphocytes or Jurkat cells and first strand cDNA was generated using a
primer designed to bind the constant region of the TCR beta chain. The custom primer promoted
amplification of TCR-specific RNA only. Equal amounts of the in vitro transcription product
(cRNA) from each sample and standards were hybridized to gene chips and then the chips were
stained and analyzed as described in Example 2.

Jurkat cells express only one TCR (Vα1.2Vβ8.1) and so the hybridization intensities of
this sample were used to establish the background threshold. The beta chain diversity of the
normal individuals was 4.4 x 10^ and 5.1 x 10^, respectively; the beta chain diversity of the
thymectomized individuals was 2.2 x 10^ and 1.8 x 10^, respectively; and the beta chain diversity
of the individual with IBD was 2.6 x 10^. The values are consistent with estimates of CDR3
diversity deduced by other means (Wagner et al., Proc. Natl. Acad. Sci. USA 95:14447-52
(1998); Correia-Neves et al., Immunity 14:21-32 (2001)) and, taken with estimates of α-chain
diversity and pairing (Arstila et al., Science 286:958-61 (1999)), would place overall T cell
diversity near 10^. Similar results were obtained for T cell diversity in mice.

OTHER EMBODIMENTS
It is to be understood that while the invention has been described in conjunction with the
detailed description thereof, the foregoing description is intended to illustrate and not limit the
scope of the invention, which is defined by the scope of the appended claims. Other aspects,
advantages, and modifications are within the scope of the following claims.

WHAT IS CLAIMED IS:

1. A method for determining lymphocyte diversity in a subject, said method [...]
    labeled nucleic acid molecules with a population of random nucleic acid molecules;
    and
    c) determining lymphocyte diversity of said subject by assessing
    hybridization of said labeled nucleic acid molecules with said population of random
    nucleic acid molecules.

2. The method of claim 1, wherein said random nucleic acid molecules within
said population are attached to a solid substrate.

3. The method of claim 2, wherein said solid substrate is a bead.

4. The method of claim 3, wherein hybridization is assessed by flow cytometry.

5. The method of claim 2, wherein said solid substrate comprises a plurality of
discrete regions, wherein each of said discrete regions comprises a different random nucleic
acid molecule.

6. The method of claim 1, wherein said labeled nucleic acid molecules are
labeled with a fluorochrome.

7. The method of claim 6, wherein said fluorochrome is fluorescein
isothiocyanate (FITC), phycoerythrin (PE), allophycocyanin (APC), or peridinin chlorophyll
protein (PerCP).

8. The method of claim 1, wherein said labeled nucleic acid molecules are
labeled with biotin.

9. The method of claim 1, wherein said labeled nucleic acid molecules are
labeled with an enzyme.

10. The method of claim 1, wherein said population of labeled nucleic acid
molecules comprises labeled RNA molecules.

11. The method of claim 1, wherein said population of labeled nucleic acid
molecules comprises labeled DNA molecules.

12. The method of claim 1, wherein said population of lymphocytes are T
lymphocytes.

13. The method of claim 12, wherein said labeled nucleic acid molecules encode a
variable region from a T cell receptor.

14. The method of claim 12, wherein said labeled nucleic acid molecules encode a
complementarity determining region (CDR) 3 β chain polypeptide.

15. The method of claim 1, wherein said population of lymphocytes are B
lymphocytes.

16. The method of claim 15, wherein said labeled nucleic acid molecules encode a
variable region from a heavy chain or a light chain.

17. A method for monitoring a disease in a subject, said method comprising
    a) obtaining labeled nucleic acid molecules from a population of said
    subject's lymphocytes, wherein each said labeled nucleic acid molecule encodes a
    lymphocyte receptor or a portion thereof;
    b) hybridizing said labeled nucleic acid molecules or fragments of said
    labeled nucleic acid molecules with a population of random nucleic acid molecules;
    c) determining lymphocyte diversity of said subject by assessing
    hybridization of said labeled nucleic acid molecules with said population of random
    nucleic acid molecules; and
    d) comparing said subject's lymphocyte diversity with lymphocyte
    diversity of a control population, wherein an alteration in said subject's lymphocyte
    diversity relative to that of said control population indicates a change in said disease.

18. The method of claim 17, wherein an increase in said subject's lymphocyte
diversity indicates a positive change in said disease.

19. The method of claim 17, wherein a decrease in said subject's lymphocyte
diversity indicates a negative change in said disease.

20. The method of claim 17, wherein said disease is an autoimmune disorder.

21. The method of claim 20, wherein said autoimmune disorder is rheumatoid
arthritis or multiple sclerosis.

22. The method of claim 20, wherein said disease is colitis.

23. The method of claim 17, wherein said disease is a lymphoid disease.

24. The method of claim 23, wherein said disease is leukemia.

25. The method of claim 23, wherein said disease is lymphoma.

26. A method for determining viral diversity in a subject, said method comprising
    a) obtaining labeled nucleic acid molecules from a biological sample of
    said subject, wherein said labeled nucleic acid molecules encode a viral polypeptide;
    b) hybridizing said labeled nucleic acid molecules or fragments of said
    labeled nucleic acid molecules with a population of random nucleic acid molecules;
    and
    c) determining viral diversity of said subject by assessing hybridization
    of said labeled nucleic acid molecules with said population of random nucleic acid
    molecules.

27. An article of manufacture comprising a) a solid substrate comprising random
nucleic acid molecules immobilized thereto; and b) a primer for producing nucleic acid
molecules encoding a lymphocyte receptor or a fragment thereof or a primer for producing
nucleic acid molecules encoding a viral polypeptide.

28. The article of manufacture of claim 27, wherein said solid substrate is a bead.

29. The article of manufacture of claim 27, wherein said solid substrate is a chip.

ABSTRACT

Methods for determining biologic diversity (e.g., lymphocyte receptor diversity or
diversity of viral quasispecies) in a subject are described, as well as methods for monitoring a
disease in a subject.